The invention relates to methods for assaying exogenous protease activity in a host cell transformed with nucleotide sequences 
encoding that protease and a specialized substrate. It also relates to methods for assaying endogenous protease activity in a host cell 
transformed with nucleotide sequences encoding a specialized substrate. When these nucleotide sequences are expressed, the exogenous 
or endogenous protease cleaves the substrate and releases a polypeptide that is secreted out of the cell, where it can be easily quantitated 
using standard assays. The methods and transformed host cells of this invention are particularly useful for identifying inhibitors of the 
exogenous and endogenous proteases. If the protease is a protease from an infectious agent, inhibitors identified by these methods are 
potential pharmaceutical agents for the treatment or prevention of infection by that agent. 
METHODS, NUCLEOTIDE SEQUENCES AND HOST CELLS FOR 
ASSAYING EXOGENOUS AND ENDOGENOUS PROTEASE ACTIVITY 

TECHNICAL FIELD OF INVENTION 
The invention relates to methods for assaying
exogenous protease activity in a host cell transformed
with nucleotide sequences encoding that protease and a
specialized substrate. It also relates to methods for
assaying endogenous protease activity in a host cell
transformed with nucleotide sequences encoding a
specialized substrate. When these nucleotide sequences
are expressed, the exogenous or endogenous protease
cleaves the substrate and releases a polypeptide that is
secreted out of the cell, where it can be easily
quantitated using standard assays. The methods and
transformed host cells of this invention are particularly
useful for identifying inhibitors of the exogenous and
endogenous proteases. If the protease is a protease from
an infectious agent or is characteristic of a diseased
state, inhibitors identified by these methods are
potential pharmaceutical agents for treatment or
prevention of the disease.

BACKGROUND ART 

Proteases play an important role in the
regulation of many biological processes. They also play
a major role in disease. In particular, proteolysis of
primary polypeptide precursors is essential to the
replication of several infectious viruses, including HIV
and HCV. These viruses encode proteins that are
initially synthesized as large polyprotein precursors.
Those precursors are ultimately processed by the viral
protease to mature viral proteins. In light of this,
researchers have begun to concentrate on inhibition of
viral proteases as a potential treatment for certain
viral diseases.

Proteases also play a role in non-infectious
diseases. For example, changes in normal cellular
function may cause an undesirable increase or decrease in
proteolytic activity. This often leads to a disease
state.

The ability to detect viral or mutant protease
activity in a quick and simple assay is important in the
biochemical characterization of these proteases and in
the screening and identification of potential inhibitors.
Several of these assays have been described in the art.
T. M. Block et al., Antimicrob. Agents
Chemother., 34, pp. 2337-41 (1990) described a prototype
assay for screening potential HIV protease inhibitors.
This assay involved cloning the HIV protease recognition
sequence into the tetracycline resistance gene (Tet^R) of
pBR322 and cotransforming E. coli with the modified Tet^R
gene and the gene encoding the HIV protease.
Co-expression of these two genes caused tetracycline
sensitivity. Potential inhibitors were identified by the
ability to restore tetracycline resistance to the
transformed bacteria.

E. Sarubbi et al., FEBS Lett., 279, pp. 265-69
(1991) described another assay for detecting HIV protease
inhibitors that utilized an HIV-1 Gag-β-galactosidase
fusion protein and a monoclonal antibody that bound to
the fusion protein in the gag region. Coexpression of
the HIV protease and the fusion protein led to cleavage
of the latter and abolished monoclonal antibody binding.
Potential inhibitors were identified by increased binding
of the monoclonal antibody to the fusion protein.

T. A. Smith et al., Proc. Natl. Acad. Sci. USA,
88, pp. 5159-62 (1991), B. Dasmahapatra et al., Proc.
Natl. Acad. Sci. USA, 89, pp. 4159-62 (1992) and M. G.
Murray et al., Gene, 134, pp. 123-28 (1993) each
described protease assay systems utilizing the yeast GAL4
protein. Each of these authors described inserting a
protease cleavage site in between the DNA binding domain
and the transcriptional activating domain of GAL4.



WO 96/34976



PCT/US96/06070




Cleavage of that site by a coexpressed protease renders
GAL4 transcriptionally inactive, leaving the transformed
yeast unable to metabolize galactose.

H.-D. Liebig et al., Proc. Natl. Acad. Sci.
USA, 88, pp. 5979-83 (1991) disclosed the use of a fusion
protein consisting of a self-cleaving protease fused to
the α fragment of β-galactosidase to assay protease
activity. Active forms of the protease cleaved
themselves off of the fusion protein and the resulting
protein was able to carry out α-complementation. Fusions
containing inactive protease were unable to perform
α-complementation.

Y. Komoda et al., J. Virol., 68, pp. 7351-57
(1994) described an assay to identify HCV protease
cleavage sites within the HCV precursor polyprotein.
These authors created chimeric proteins comprising
various portions of the HCV precursor polyprotein
inserted in between the E. coli maltose binding protein
and dihydrofolate reductase. If the HCV portion of these
chimeras contained a cleavage site, the chimera would be
cleaved when it was coexpressed with HCV protease in E.
coli. Cleavage of the chimera was determined by
SDS-polyacrylamide gel electrophoresis of E. coli lysates.

Y. Hirowatari et al., Anal. Biochem., 225, pp.
113-120 (1995) described another assay to detect HCV
protease activity. In this assay, the substrate, HCV
protease and a reporter gene are cotransfected into COS
cells. The substrate is a fusion protein consisting of
(HCV NS2)-(DHFR)-(HCV NS3 cleavage site)-Tax1. The
reporter gene is chloramphenicol acetyltransferase (CAT)
under control of the HTLV-1 long terminal repeat (LTR) and
resides in the cell nucleus following expression. The
uncleaved substrate is expressed as a membrane-bound
protein on the surface of the endoplasmic reticulum due
to the HCV NS2 portion. Upon cleavage, the released Tax1
protein translocates to the nucleus and activates CAT
expression by binding to the HTLV-1 LTR. Protease
activity is determined by measuring CAT activity in a
cell lysate.

Despite these developments, no one has yet
developed a protease assay system that can be carried out
with higher eukaryotic cells, is quantitative and
does not require cell lysis prior to quantitation.
Avoiding cell lysis prior to quantitation is desirable in
that the assay may be performed more rapidly and with
less manipulation. Also, lysis can often lead to
aberrant results. Thus, there is a need for an accurate
and quantitative cellular-based protease assay that can
be carried out in a higher eukaryotic cell without cell
lysis.

SUMMARY OF THE INVENTION

The present invention fulfills this need by
providing methods for assaying exogenous protease
activity in a host cell expressing that protease. The
methods involve utilizing a host cell expressing a first
nucleotide sequence encoding an exogenous protease and a
second nucleotide sequence encoding an artificial
substrate for that protease. The artificial substrate
comprises a cleavage site for the protease situated at or
near the natural maturation site of a pre-polypeptide,
part of which is secreted following proteolytic
processing. When the host is grown under conditions that
cause expression of the first and second nucleotide
sequences, the exogenous protease cuts the artificial
substrate at the cleavage site, releasing the mature
polypeptide, which is secreted into the growth media. The
growth media is then isolated and assayed for the mature
polypeptide.

Alternatively, the invention may be utilized to
assay endogenous proteases, especially when quantitation
of those proteases is difficult due to the inability to
detect or distinguish between the cleaved and uncleaved
native substrate.



SUBSTITUTE SHEET (RULE 23) 
According to one aspect of the invention, the
assay is used to quantitate an exogenous viral protease.
Such assays are particularly useful as replacements for
current viral protease assays that require the use of
intact, infectious virus or where no simple viral model
is available to detect viral protease activity. These
assays may be used to identify and assay potential
inhibitors of viral proteases which, in turn, may be used
as pharmaceutical agents for the treatment or prevention
of viral disease.

This invention also provides host cells
transformed with nucleotide sequences encoding an
exogenous protease and a corresponding substrate, as
well as those transformed with a specialized substrate
for an endogenous protease. These hosts may be used in
the methods of this invention.

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 depicts the structure of pcDL-SRα296.
Figure 2 depicts the structure of a derivative
of pKV containing the pre-IL-1β coding sequence.

Figure 3, panel A, is an immunoblot of cell
lysates from cells transfected with a NS3-wild-type or
NS3-mutant NS3-4A-4B-IL-1β or cotransfected with a
NS3-mutant NS3-4A-4B-IL-1β and a NS3(1-180) construct
probed with an anti-NS3 antibody. Figure 3, panel B, is an
immunoblot of the same cell lysates probed with an
anti-IL-1β antibody.

Figure 4 depicts the immunoprecipitation of the
media from 35S-labelled cells transfected with either a
NS3-wild-type or NS3-mutant NS3-4A-4B-IL-1β construct with
an anti-IL-1β antibody.

Figure 5 is an immunoblot of cell lysates from
cells co-transfected with NS3-4A and either a NS5A/5B- or
CSM-containing pre-IL-1β substrate probed with an
anti-IL-1β antibody.

Figure 6 depicts the immunoprecipitation of the
media from 35S-labelled cells co-transfected with NS3-4A
and either a NS5A/5B- or CSM-containing pre-IL-1β
substrate with an anti-IL-1β antibody.

Figure 7 depicts the inhibition of HCV NS3
protease cleavage of pre-IL-1β* by varying concentrations
of VH16075 and VH15924.

DETAILED DESCRIPTION OF THE INVENTION 

The present invention provides a method for
assaying exogenous protease activity in a host cell
comprising the steps of:

(a) incubating a host cell transformed
with a first nucleotide sequence encoding an exogenous
protease and a second nucleotide sequence encoding an
artificial polypeptide substrate under conditions which
cause said exogenous protease and said artificial
substrate to be expressed;

wherein said substrate comprises:

(i) a cleavage site for said
exogenous protease; and

(ii) a polypeptide that is secreted
out of said cell following cleavage by said
exogenous protease;

(b) separating said host cell from its
growth media under non-lytic conditions; and

(c) assaying said growth media for the
presence of said secreted polypeptide.

As used herein, the term "exogenous protease"
means a protease not normally expressed by the host cell
used in the assay. That term includes full-length
proteases that are identical to those found in nature, as
well as catalytically active fragments thereof.

The choice of exogenous protease to be assayed
is solely dependent upon the decision of the user. The
only requirements are that: (1) the specificity of the
enzyme in terms of what amino acid residues or sequences
it cleaves at be known; (2) the primary structure of at
least the catalytically active portion of the enzyme be
known; and (3) a nucleotide sequence encoding at least an
enzymatically active portion of the protease exists or
can be made and can be expressed in a heterologous host
cell.

According to a preferred embodiment, the
exogenous protease is a protease encoded by a pathogenic
agent. More preferred is a protease encoded by a
pathogenic virus. Most preferably, the exogenous
protease is the NS3 protease of hepatitis C virus
("HCV").

HCV NS3 protease is a 70 kilodalton protein
that is involved in the maturation of viral polypeptides
following infection. It is a serine protease which has a
Cys-X or Thr-X substrate specificity. It has also been
shown that the protease activity of NS3 resides
exclusively in the N-terminal 180 amino acids of the
enzyme. Therefore, nucleotide sequences encoding
anywhere from the first 180 amino acids of NS3 up to the
full length enzyme may be utilized in the methods of this
invention. Active fragments of other known proteases may
also be used as an alternative to the full-length
protease.

According to an alternative embodiment, the
invention provides a method for assaying endogenous
protease activity in a host cell comprising the steps of:

a) incubating a host cell transformed with a
nucleotide sequence encoding an artificial polypeptide
substrate under conditions which cause said artificial
substrate to be expressed;

wherein said substrate comprises:

i) a cleavage site for said endogenous
protease; and

ii) a polypeptide that is secreted out of
said cell following cleavage by said endogenous
protease;

b) separating said host cell from its growth
media under non-lytic conditions; and

c) assaying said growth media for the
presence of said secreted polypeptide.

The term "endogenous protease", as used
throughout this application, refers to a protease that
is normally expressed by the host cell. It includes both
wild type proteases and naturally occurring
mutant proteases with increased or decreased activity.

According to the invention, the artificial
polypeptide substrate used in the methods must comprise a
cleavage site for the protease to be assayed and must be
secreted out of the cell following cleavage by that
protease. Preferably, the DNA encoding the artificial
substrate is derived from a gene or cDNA encoding a
naturally occurring polypeptide that is normally cleaved
and then secreted out of a cell, but not necessarily
cleaved by the cell utilized in the assay.

The DNA encoding that polypeptide is then
modified by inserting, in frame with the polypeptide
coding sequence, nucleotides encoding a cleavage site
that is recognized by the exogenous protease to be
tested. If the cell utilized in the assay is capable of
cleaving the substrate at its native cleavage site, then
the nucleotides encoding the polypeptide's native
cleavage site must be altered so as to render it
uncleavable by endogenous proteases.

The protease cleavage site in the artificial
substrate is preferably inserted within 60 amino acids on
either side of the native cleavage site. Preferably, the
artificial cleavage site is inserted N-terminal to the
native cleavage site. Alternatively, the protease
cleavage site can be created by mutating the native
polypeptide sequence. Such mutation is preferably
performed on a sequence within 60 amino acids of the
native cleavage site, more preferably N-terminal to the
native cleavage site and within 8-10 amino acids of it;
or is a mutation of the native cleavage site itself.

Alteration of the native cleavage site to
render it uncleavable by the host cell may be achieved,
if necessary, by insertion, deletion or mutation of
nucleotides at that site.
Insertion of the protease cleavage site into
the substrate and alteration of its native cleavage site
may be accomplished by any combination of a number of
recombinant DNA techniques well known in the art, such as
site directed mutagenesis or standard restriction
digest/ligation cloning techniques. Alternatively, the
DNA encoding all or part of the artificial substrate may
be produced synthetically using a commercially available
automated oligonucleotide synthesizer. Regardless of the
techniques used to insert the protease cleavage site into
the substrate polypeptide or alter its native cleavage
site, it is crucial that the reading frame of the
substrate polypeptide remain intact, without the
insertion of stop codons.
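
The reading-frame requirement can be checked computationally before a construct is built. The following is a minimal illustrative sketch, not part of the disclosed method: the function name insert_in_frame, the toy coding sequence and the toy insert are all hypothetical, and the check assumes the coding sequence starts at its first codon.

```python
# Illustrative sketch only: names and sequences below are hypothetical.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def insert_in_frame(coding_seq, insert_seq, codon_index):
    """Insert insert_seq immediately before codon number codon_index (0-based).

    Raises ValueError if the insert would shift the reading frame or if the
    resulting sequence contains a stop codon anywhere except the last codon.
    """
    coding_seq = coding_seq.upper()
    insert_seq = insert_seq.upper()
    if len(insert_seq) % 3 != 0:
        raise ValueError("insert length must be a multiple of 3")
    pos = codon_index * 3
    if pos > len(coding_seq):
        raise ValueError("codon_index lies beyond the coding sequence")
    new_seq = coding_seq[:pos] + insert_seq + coding_seq[pos:]
    # Split into complete codons, reading from the first base.
    usable = len(new_seq) - len(new_seq) % 3
    codons = [new_seq[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon is tolerated only as the final codon.
    for i, codon in enumerate(codons[:-1]):
        if codon in STOP_CODONS:
            raise ValueError(f"premature stop codon {codon} at codon {i}")
    return new_seq


# Toy example: a 6-nt insert encoding Cys-Ser placed before the fourth codon.
toy_orf = "ATGGCACACGATGCATAA"
modified = insert_in_frame(toy_orf, "TGCTCC", 3)
```

The same check applies to a mutated native site: any change that alters the length by a non-multiple of three, or that creates TAA, TAG or TGA in frame, is rejected.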

The secretable polypeptide from which
the artificial substrate is derived may be selected from
any pre-polypeptide that can be cleaved by, and whose
resulting mature polypeptide can be secreted out of, the
host cell used for the assay, but that is not normally
present in that cell. For use in eukaryotic cells there
are two main categories of pre-polypeptide from which the
choice can be made.

The first and preferred category comprises
pre-polypeptides that are expressed and cleaved in the
cytoplasmic compartment. Among these proteins are
interleukin-1β (IL-1β), interleukin-1α (IL-1α), basic
fibroblast growth factor (bFGF) and endothelial-monocyte
activating polypeptide II (EMAP-II). The advantage of
using cytoplasmic pre-polypeptides is that there is a
much greater likelihood that the protease and the
artificial substrate will share the same subcellular
compartment. This is because most proteases of interest
are also cytoplasmic proteins and thus will have access
to the artificial substrate.

The second category of pre-polypeptides that
may be used to create artificial substrates used in the
methods of this invention are those that are expressed on
the cell surface through the organellar secretory pathway
and are retained on the cell surface. Such substrates
are useful to assay endogenous and exogenous cell
membrane proteases, as well as exogenous proteases that
are similarly engineered to be cell membrane proteins.
The technique of creating a cell membrane protease or
substrate involves cloning a leader peptide (i.e., signal
sequence) onto the N-terminus of the substrate or
protease and a hydrophobic, membrane anchor sequence
(either a transmembrane domain or a
glycosylphosphatidylinositol anchor sequence) onto the
C-terminus. The resulting substrate is a cell membrane
protein with an extracellularly located cleavage site.
When cleaved by a cell membrane protease on the same or a
neighboring cell, the secreted polypeptide portion of the
substrate is released into the media.

Examples of sequences that may be used for
anchoring these proteins in the membrane are the
transmembrane domains of TNFα precursor [Nedospasov et
al., Cold Spring Harb. Symp. Quant. Biol., 51, pp. 611-24
(1986)], SP-C precursor [Keller et al., Biochem J., 277,
pp. 493-99 (1991)], or alkaline phosphatase [Berger et
al., Proc. Natl. Acad. Sci. USA, 86, pp. 1457-60 (1989)].

Techniques for cloning a signal sequence onto a
cytoplasmic protein have been well documented [see, for
example, Kizer and Tropsha, BBRC, 174, pp. 586-92 (1991);
Jost et al., J. Biol. Chem., 269, pp. 26267-72 (1994)
(expression and secretion of functional single chain Fv
molecules using immunoglobulin light chain leader
sequence); and Sasada et al., Cell Structure Function,
13, pp. 129-41 (1988) (secretion of human EGF and IgE in
mammalian cells using an IL-2 leader sequence)], as have
techniques for cloning transmembrane anchor sequences
onto cytoplasmic proteins [Berger et al., supra; Oda et
al., Biochem J., 301, pp. 577-83 (1984)]. By combining
these two techniques, the protease or substrate of
interest can be converted from a cytoplasmic protein into
a cell surface membrane protein.

In order to ensure that the substrate and
protease will have access to one another, and according to
an alternate embodiment of the invention, the artificial
substrate and an exogenous protease to be assayed may be
encoded as part of a single polyprotein. That
polyprotein may be a cytoplasmic or a membrane protein,
as long as the substrate and protease domains reside in
the same cellular compartment.

The choice of host cell to use in this method
is virtually unlimited. Any cell that can grow in
culture, be transformed or transfected with heterologous
nucleotide sequences and can express those sequences may
be employed in this method. These include bacteria, such
as E. coli and Bacillus, yeast and other fungi, plant
cells, insect cells and mammalian cells. In addition,
expression of either of those sequences in higher
eukaryotic host cells may be transient or stable.
Preferably, the host cell is a higher eukaryotic cell that
is incapable of cleaving the substrate at its native
cleavage site. Preferably, the host cell is a mammalian
cell. Most preferably, the host cell is a COS cell.

It will be apparent that the specific choice of
cell is governed by the particular protease to be assayed
and by the particular artificial substrate used. In
embodiments that assay an exogenous protease, one obvious
limitation is that the endogenous cellular enzymes of the
chosen host must be unable to cleave the artificial
substrate to any significant extent. The endogenous rate
of artificial substrate cleavage may be determined by
transforming the selected host cell with only the
nucleotide sequence coding for the artificial substrate
and then growing that host under conditions which cause
expression of that nucleotide sequence and which would
cause expression of the exogenous protease-encoding
nucleotide sequence if that sequence were present. The
growth media of the cell is then assayed for the presence
of the secreted polypeptide portion of the substrate. In
assays that measure exogenous protease activity, control
cells (no exogenous protease expressed) should secrete
less than 10% of the total amount of expressed substrate
(due to endogenous cleavage and, in assays that do not
distinguish between cleaved and uncleaved substrates,
leaching of uncleaved substrate out of the cell) in order
to be useful in the methods of this invention. When an
endogenous protease is assayed, a control for
non-specific substrate cleavage is a cell transformed with
a substrate that contains a mutation at the cleavage site.
This mutation renders the substrate uncleavable by the
specific endogenous protease being assayed, but still
susceptible to non-specific cleavage. As with assays for
exogenous proteases, control cells should secrete less
than 10% of the total amount of expressed substrate.
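
The 10% criterion for control cells is simple arithmetic and can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and it assumes that "total expressed substrate" is approximated by the sum of the substrate measured in the growth media and the substrate remaining cell-associated, in the same arbitrary units.

```python
# Illustrative sketch only: function names and the definition of "total"
# (secreted + cell-associated) are assumptions, not taken from the text.

def secreted_fraction(secreted, cell_associated):
    """Fraction of total expressed substrate found in the growth media."""
    total = secreted + cell_associated
    if total <= 0:
        raise ValueError("total expressed substrate must be positive")
    return secreted / total


def control_is_acceptable(secreted, cell_associated, limit=0.10):
    """True when control cells secrete less than `limit` of total substrate."""
    return secreted_fraction(secreted, cell_associated) < limit


# Hypothetical control measurements (arbitrary units).
good_control = control_is_acceptable(4.0, 96.0)    # 4% background secretion
bad_control = control_is_acceptable(15.0, 85.0)    # 15% background secretion
```

A host and substrate pair whose control fails this check would show too much protease-independent signal to be useful in the assay.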

In order to quantitate the protease activity,
the amount of secreted substrate polypeptide is measured.
Quantitation may be achieved by subjecting the growth
media to any of the various standard assay procedures
that are well known in the art. These include, but are
not limited to, immunoblotting, ELISA,
immunoprecipitation, RIA, other colorimetric assays,
enzymatic assay or bioassay. Quantitation techniques
that employ antibodies preferably utilize antibodies
that have low cross-reactivity with the uncleaved
substrate. Preferably cross-reactivity is less than 20%
and more preferably less than 5%.

According to another embodiment, the present
invention provides a method of screening for protease
inhibitors. In this method, the above-described assay is
carried out in the presence and absence of potential
inhibitors of the protease. When the assays of this
invention are performed using cells which transiently
express the substrate and protease, the inhibitor is
preferably added immediately after transfection with the
protease and substrate-encoding DNA sequences. When
stable transformants are used, the potential inhibitor is
added at the beginning of the assay. The efficacy of the
potential inhibitor (and its ability to cross the cell
membrane) is determined by comparing the amount of
secreted substrate polypeptide present in the media of
cells assayed in its presence versus its absence.
Compounds which cause at least a 90% reduction in the
amount of secreted substrate polypeptide are potentially
useful protease inhibitors.
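
The comparison of secreted polypeptide with and without a test compound can be expressed as a percent reduction. The sketch below is illustrative only; the function names and the example readings are hypothetical, and the readings are assumed to be background-corrected amounts in the same units.

```python
# Illustrative sketch only: names and readings are hypothetical.

def percent_reduction(without_inhibitor, with_inhibitor):
    """Percent decrease in secreted polypeptide caused by the compound."""
    if without_inhibitor <= 0:
        raise ValueError("uninhibited signal must be positive")
    return 100.0 * (without_inhibitor - with_inhibitor) / without_inhibitor


def is_candidate_inhibitor(without_inhibitor, with_inhibitor, threshold=90.0):
    """True when the compound causes at least `threshold` percent reduction."""
    return percent_reduction(without_inhibitor, with_inhibitor) >= threshold


# Hypothetical readings: 200 units without compound, 10 units with it.
reduction = percent_reduction(200.0, 10.0)
hit = is_candidate_inhibitor(200.0, 10.0)
miss = is_candidate_inhibitor(200.0, 50.0)   # only a 75% reduction
```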

In order that the invention described herein
may be more fully understood, the following examples are
set forth. It should be understood that these examples
are for illustrative purposes only and are not to be
construed as limiting this invention in any manner.






EXAMPLE 1 

Construction Of Expression Plasmids 
A. HCV NS3 Protease 

We cloned the nucleotide sequence coding for
the entire, intact HCV NS3 protease, an NS3-4A
polyprotein or a truncated NS3 consisting of amino acids
1 to 180 into the mammalian expression plasmid pcDL-SRα
[Y. Takebe et al., Mol. Cell. Biol., 8, pp. 466-72
(1988)]. That plasmid contains an SV40 origin of
replication and an HTLV LTR enhancer/promoter sequence
which ultimately drives the high level expression of the
NS3 coding sequences (Figure 1).

The respective NS3 coding fragments (full
length NS3, NS3-4A polyprotein or truncated NS3 (amino
acids 1-181)) were obtained by PCR of the corresponding
portions of a full length HCV H strain cDNA (SEQ ID
NO:1). For each of the three coding fragments the
following 5' primer was used (SEQ ID NO:2):

5' GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG 3'.

The fragment-specific 3' primers used were:

NS3 - (SEQ ID NO:3):
3' GAAGATCTGAATTCTAGATTTTACGTGACGACCTCCACGTCGGC 5';

NS3-4A - (SEQ ID NO:4):
3' GAAGATCTGAATTCTAGATTTTAGCACTCTTCCATCTCATCGAA 5'; and

NS3(1-181) - (SEQ ID NO:5):
3' GAAGATCTGAATTCTAGATTTTAGGATCTCATGGTTGTCTCTAGG 5'.

These primers produced PCR-amplified fragments containing
multiple restriction sites at either end for ease of
cloning.
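
The restriction sites carried by a primer can be located with a simple text search. The sketch below is illustrative only: the function name find_sites is hypothetical, the enzyme table lists standard recognition sequences for a few common enzymes (chosen here for illustration, not taken from the text), and the sequence scanned is the 5' primer as written above.

```python
# Illustrative sketch only: find_sites and the choice of enzymes are
# assumptions. Recognition sequences are the standard ones for each enzyme.
SITES = {
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
    "EcoRI": "GAATTC",
    "BglII": "AGATCT",
}


def find_sites(seq, sites=SITES):
    """Return {enzyme: [0-based start positions]} for every site present."""
    seq = seq.upper()
    hits = {}
    for name, motif in sites.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            hits[name] = positions
    return hits


# 5' primer (SEQ ID NO:2) as given in the text.
FIVE_PRIME = "GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG"
primer_sites = find_sites(FIVE_PRIME)
```

For this primer the search reports SpeI, PstI and XbaI sites upstream of the ATG start codon, consistent with the PstI digestion used for cloning.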

In order to ligate the fragments to the vector,
the vector was first cleaved with PstI and EcoRI to
remove a small fragment. The cut vector was then
purified and ligated to the respective PstI/EcoRI cut NS3
protease-encoding fragment.

B. IL-1β/NS3 Substrate

A derivative of plasmid pKV containing the
pre-IL-1β coding sequence has been described by P. K. Wilson
et al., Nature, 370, pp. 253-70 (1994). That plasmid
contains the SV40 origin of replication and the early
promoter. The pre-IL-1β sequence was cloned between the
SpeI and BglII sites shown in Figure 2.

We inserted a double stranded synthetic DNA
fragment (SEQ ID NO:6) which encoded 20 amino acids (SEQ
ID NO:7: GADTEDVVCCSMSYTWTGVH) and contained linkers at
both ends that included an ApaLI restriction site. The
DNA was cloned into the ApaLI site in pre-IL-1β (between
the codons for amino acids His115 and Asp116), immediately
upstream of the native cleavage site (located between
Asp116 and Ala117). The first 18 amino acids of the insert
correspond to the HCV peptide 5A/5B cleavage site. The
last two amino acids are encoded by the linker. The
inserted DNA maintained the reading frame of the native
pre-IL-1β protein. The resulting substrate is referred
to throughout the application as "pre-IL-1β*".

NS3 cleaves the inserted peptide in between the
cysteine and serine residues. Because the COS cells we
utilized in this assay were incapable of cleaving
pre-IL-1β (data not shown), we did not have to knock out
the native pre-IL-1β cleavage site.
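
The position of the scissile bond within the inserted peptide can be illustrated with a short sketch. This is illustrative only: the function name cleave_between is hypothetical, and the 20-residue insert is written here as GADTEDVVCCSMSYTWTGVH, which assumes the two valines of the HCV 5A/5B junction sequence.

```python
# Illustrative sketch only: cleave_between is a hypothetical helper, and the
# insert sequence assumes "VV" at positions 7-8 of the 20-residue peptide.
INSERT = "GADTEDVVCCSMSYTWTGVH"


def cleave_between(peptide, p1, p1_prime):
    """Split peptide at the first occurrence of residue p1 followed by
    residue p1_prime; returns (N-terminal, C-terminal) or None if absent."""
    idx = peptide.find(p1 + p1_prime)
    if idx == -1:
        return None
    cut = idx + len(p1)
    return peptide[:cut], peptide[cut:]


# Cleavage between cysteine (C) and serine (S), as described for NS3.
fragments = cleave_between(INSERT, "C", "S")
```

In the full substrate, the C-terminal fragment would carry the remainder of the insert followed by the downstream IL-1β sequence.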

In another construct, we performed site
directed mutagenesis to alter the native pre-IL-1β
cleavage site of Asp116-Ala117-Pro118 to Cys-Ser-Met, a
conserved recognition sequence for NS3. This construct
is referred to throughout the application as
"pre-IL-1β(CSM)".

C. NS3-4A-Δ4B-IL-1β

In order to create a single fusion polypeptide
that encoded both the exogenous protease and the
polypeptide substrate, we utilized the fact that NS3 can
autoprocess (cleave) an NS3-4A-4B polyprotein at both the
NS3-4A and 4A-4B junctions.

We isolated a DNA fragment that encoded NS3-4A
and the first 60 amino acids of 4B through PCR using the
HCV strain H cDNA referred to above (SEQ ID NO:1) and the
following primers:

SEQ ID NO:8:
5' GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG 3'; and

SEQ ID NO:9:
3' GGACGCGGTCTGCAGGAGGCCGAGGGC 5'.

The PCR products were digested with PstI and XbaI prior to
cloning.

The mature IL-1β portion of the construct
(amino acids 117-269 of SEQ ID NO:11) was created by PCR
cloning of full length pre-IL-1β cDNA (SEQ ID NO:10)
using the following primers:

SEQ ID NO:12:
5' GTCGGGGTGGTGGAGGGAGCTGTAGGATCAGTGAAG 3'; and

SEQ ID NO:13:
3' GGGAATTCTAGATTTTAGGAAGACACAAATTG 5'.

These PCR products were digested with PstI and EcoRI
prior to cloning.

The NS3-4A-Δ4B and IL-1β fragments were then
ligated together with XbaI/EcoRI digested pcDL-SRα to
obtain the desired construct.

As a control we created a mutant NS3 protease
fusion protein construct. This construct was identical
to the one described above, except that the NS3 portion
was created by PCR using the same primers and the cDNA of
the NS3 active site mutant S1165A [A. Grakoui et al., J.
Virol., 67, pp. 2832-43 (1993)]. The NS3 active site
mutant contains a serine-to-alanine mutation in its
active site, rendering the enzyme inactive.



EXAMPLE 2 

Transfection Of COS Cells And Assay Of Secreted IL-1β

The expression plasmid constructs described in
Example 1 were transfected into COS-7 cells using the
DEAE-Dextran transfection protocol [Gu et al., Neuron, 5,
pp. 147-57 (1990)]. COS cells in 6-well clusters or 100
mm dishes at 50% confluency were transfected with 4-10 μg
of the desired plasmid in a DEAE-Dextran solution.
Following transfection, the cells were incubated an
additional 48 hours before assaying.

The processing of pre-IL-1β or NS3-4A-Δ4B-IL-1β
fusion protein and subsequent secretion of mature IL-1β
into the media was measured by ELISA of IL-1β using an
antibody that was specific for mature IL-1β (approx. 3%
cross-reactivity with pre-IL-1β). We analyzed expression
by harvesting the COS cells in ice-cold phosphate
buffered saline, lysing the cells in a 0.1% Triton X-100
buffer and centrifuging the lysate to remove cell debris.
The lysates were then analyzed by SDS-PAGE and
immunoblotting using an IL-1β antibody (Genzyme) and an
NS3 antibody. Alternatively, expression, processing and
secretion were analyzed by labelling the cells for 24
hours in the presence of [35S]-methionine, incubating the
cells for an additional 24 hours after the label was
removed and then utilizing immunoprecipitation and
SDS-PAGE to analyze the polypeptides.



EXAMPLE 3 

NS3-Specific Processing Of An NS3-4A-Δ4B-IL-1β Fusion
Protein And Secretion Of Δ4B-IL-1β Into The Media






Transfectants expressing the NS3-4A-Δ4B-IL-1β
fusion protein autoprocessed that protein at both the
NS3-4A and 4A-4B junctions. The cell lysates of these
transfectants were subjected to Western blotting
utilizing an anti-NS3 antibody. Figure 3, panel A, Wt-1
and Wt-2 lanes, shows that this experiment produced a
doublet band in the 70 kDa area, present only as a single
band in the untransformed control cells (panel A, No DNA
lane). The second band of the doublet in the Wt-1 and
Wt-2 lanes corresponds to the size of mature NS3. A
transfectant that expressed an inactive mutant
NS3-containing NS3-4A-Δ4B-IL-1β fusion protein demonstrated
no 70 kDa doublet and therefore was not autoprocessed
(NS3 mutant lane). A transfectant that co-expressed the
same mutant fusion protein together with a truncated, but
active NS3, NS3(1-180), was also analyzed.
Surprisingly, the mutant fusion protein did not appear to
be cleaved by NS3(1-180), as indicated by the lack of a
doublet in the 70 kDa region (NS3 mutant + NS3(1-180)
lane). However, a 20 kDa band representing the truncated
NS3 was detected in that lysate, as indicated by the
NS3(1-180) arrow.

A similar experiment performed on cell lysates
utilizing a mature IL-1β-specific antibody demonstrated
the presence of a band corresponding in size to the
Δ4B-IL-1β portion of the fusion protein in both the
NS3-4A-Δ4B-IL-1β transfectants (Figure 3, panel B, Wt-1 and Wt-2
lanes) and, to a lesser degree, in the NS3 mutant fusion
protein/NS3 (1-180) cotransfectant. Virtually no IL-1β
was detected in the NS3 mutant fusion protein expressing
transfectant (IL-1β arrow). These experiments confirm
that the cleavage observed in the wild type
NS3-4A-Δ4B-IL-1β transfectants was dependent upon NS3 protease
activity. Thus, we had proof that cleavage of this
fusion protein was essentially NS3-dependent and not
caused by some endogenous protease.

Secretion of the cleaved substrate was
determined by assaying culture media with a commercially
available mature IL-1β-specific ELISA assay (R&D Systems,
Minneapolis, MN). For the wild-type NS3-containing
construct we detected a concentration of 2.5 µg/ml of
IL-1β in the medium. We detected less than 0.25 µg/ml of
IL-1β in the media of cells transfected with the mutant
NS3-containing construct. An immunoprecipitation experiment
utilizing the same anti-IL-1β antibody demonstrated the
presence of Δ4B-IL-1β in the media of cells containing
the wild type NS3-containing construct, but none from the
mutant NS3-containing construct (Figure 4), thus
confirming these results.
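
As a rough illustration of the comparison above, the following Python sketch computes the minimum fold difference in secreted IL-1β between the wild-type and mutant constructs from the two ELISA values just reported. The function name, and the treatment of 0.25 µg/ml as an upper bound on the mutant signal, are assumptions made for this illustration only.

```python
def fold_over_control(signal, control):
    """Return how many times larger `signal` is than `control` (same units)."""
    if control <= 0:
        raise ValueError("control concentration must be positive")
    return signal / control


# ELISA values reported above, in ug/ml of IL-1beta in the culture medium.
WILD_TYPE_UG_PER_ML = 2.5
MUTANT_UPPER_BOUND_UG_PER_ML = 0.25  # the mutant signal was below this value

# Because the mutant value is only an upper bound, the result is a lower
# bound on the true fold difference.
min_fold = fold_over_control(WILD_TYPE_UG_PER_ML, MUTANT_UPPER_BOUND_UG_PER_ML)
print(f"wild-type signal is at least {min_fold:.0f}-fold above the mutant control")
```

A ratio of this kind gives a quick sense of the assay window between an active and an inactive protease before any inhibitor is tested.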

EXAMPLE 4 

NS3-Specific Processing Of Mutated Pre-IL-1β
Containing An Artificial Cleavage Site And
Secretion Of IL-1β Into The Media

We confirmed that NS3 protease can cleave
artificial substrates other than an HCV polypeptide by
cotransfecting COS cells with the NS3-4A and either of
the pre-IL-1β-containing artificial substrate expression
constructs described in Example 1C.

Co-expression of the NS3-4A and pre-IL-1β*
substrate sequences resulted in rapid cleavage of the
substrate and concomitant secretion of a 19 kDa IL-1β into
the media. Secretion was quantitated using an ELISA
specific for the processed form of IL-1β. An immunoblot
of cell lysates from these transformants demonstrated the
presence of both cleaved and uncleaved substrate (Figure
5, NS3-4A + IL-1β* lane). The same experiment was
performed using cells that were metabolically labelled
with [35S]-methionine, followed by immunoprecipitation of
the media with the processed IL-1β-specific antibody.
The results of the immunoprecipitation experiment are
shown in Figure 6, NS3-4A + pre-IL-1β* lanes.

When we coexpressed NS3-4A and the
pre-IL-1β (CSM) sequences, we also observed cleavage of the
substrate at the predicted Cys-Ser site. Both cleaved
and uncleaved forms were observed in cell lysates using
immunoblotting specific for IL-1β (Figure 5, NS3-4A +
IL-1β (CSM) lane). Immunoprecipitation of the media from
[35S]-methionine labelled cells also demonstrated the
presence of IL-1β-containing cleavage product, but less than
that observed for the 5A-5B-containing pre-IL-1β
substrate (Figure 6, NS3-4A + pre-IL-1β (CSM) lane).

EXAMPLE 5 
Assay of NS3 Inhibitors 

We tested the potential of compounds VH-15924 
and VH-16075 as HCV NS3 protease inhibitors in our 
assays.

Transfectants expressing the NS3-4A-Δ4B-IL-1β
were grown in the presence of varying amounts of VH-15924.
Even at concentrations as high as 100 µM, we detected the
presence of the cleavage product, Δ4B-IL-1β, in the
media. This indicated that VH-15924 was not an effective
inhibitor of NS3 protease.

We also assayed the inhibition of cleavage and
secretion of pre-IL-1β* substrate by both VH-15924 and
VH-16075. VH-16075 inhibited cleavage and secretion with
an IC50 of 4 µM. As in the previous experiment, VH-15924
did not completely inhibit cleavage/secretion even at
concentrations of 100 µM (Figure 7).
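
One way an IC50 of this kind might be estimated from secreted-reporter readings is sketched below in Python. This is a minimal sketch under stated assumptions, not the procedure used to obtain the value above: the ELISA readings are invented for illustration, the function names are hypothetical, and log-linear interpolation between the two bracketing doses stands in for whatever curve fitting was actually performed.

```python
import math


def percent_inhibition(signal, uninhibited, background=0.0):
    """Percent reduction of secreted reporter relative to the no-compound control."""
    window = uninhibited - background
    if window <= 0:
        raise ValueError("uninhibited control must exceed background")
    return 100.0 * (1.0 - (signal - background) / window)


def interpolate_ic50(concentrations_um, inhibitions_pct):
    """Log-linear interpolation of the dose giving 50% inhibition.

    Concentrations must be positive. Returns None when no pair of
    neighbouring doses brackets 50% inhibition.
    """
    points = sorted(zip(concentrations_um, inhibitions_pct))
    for (c_lo, i_lo), (c_hi, i_hi) in zip(points, points[1:]):
        if i_lo < 50.0 <= i_hi:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_c
    return None


# Hypothetical ELISA readings (arbitrary units); NOT data from this example.
uninhibited = 2.0
doses_um = [1.0, 10.0, 100.0]
readings = [1.6, 0.4, 0.1]

inhibitions = [percent_inhibition(r, uninhibited) for r in readings]
ic50 = interpolate_ic50(doses_um, inhibitions)
print(f"estimated IC50: {ic50:.1f} uM")
```

A compound that never reaches 50% inhibition over the tested range, as described for VH-15924, would make `interpolate_ic50` return `None` rather than a value.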

While I have hereinbefore presented a number of 
embodiments of this invention, it is apparent that my
basic construction can be altered to provide other 
embodiments which utilize the methods of this invention. 
Therefore, it will be appreciated that the scope of this 
invention is to be defined by the claims appended hereto 
rather than the specific embodiments which have been 
presented hereinbefore by way of example.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Su, Michael 

(ii) TITLE OF INVENTION: METHODS AND HOST CELLS FOR ASSAYING 
EXOGENOUS AND ENDOGENOUS PROTEASE ACTIVITY 

(iii) NUMBER OF SEQUENCES: 13

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: Fish & Neave 

(B) STREET: 1251 Avenue of the Americas 

(C) CITY: New York 

(D) STATE: New York 

(E) COUNTRY: United States of America 

(F) ZIP: 10020 

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(C) CLASSIFICATION: 

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Haley Jr, James F 

(B) REGISTRATION NUMBER: 27,794 

(C) REFERENCE/DOCKET NUMBER: VPI/95-01 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 212-596-9000 

(B) TELEFAX: 212-596-9090 



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9401 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 3420..5312

(D) OTHER INFORMATION: /product= "NS3 protease" 

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 5313..5474

(D) OTHER INFORMATION: /product= "NS4A"

(ix) FEATURE: 

(A) NAME/KEY: mat_peptide 

(B) LOCATION: 5475..5552

(D) OTHER INFORMATION: /product= "truncated NS4B" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GCCAGCCCCC TGATGGGGGC GACACTCCAC CATAGATCAC TCCCCTGTGA GGAACTACTG 60 

TCTTCACGCA GAAAGCGTCT AGCCATGGCG TTAGTATGAG TGTCGTGCAG CCTCCAGGAC 120

CCCCCCTCCC GGGAGAGCCA TAGTGGTCTG CGGAACCGGT GAGTACACCG GAATTGCCAG 180

GACGACCGGG TCCTTTCTTG GATAAACCCG CTCAATGCCT GGAGATTTGG GCGTGCCCCC 240 

GCAAGACTGC TAGCCGAGTA GTGTTGGGTC GCGAAAGGCC TTGTGGTACT GCCTGATAGG 300 

GTGCTTGCGA GTGCCCCGGG AGGTCTCGTA GACCGTGCAC CATGAGCACG AATCCTAAAC 360 

CTCAAAGAAA AACCAAACGT AACACCAACC GTCGCCCACA GGACGTCGAG TTCCCGGGTG 420 

GCGGTCAGAT CGTTGGTGGA GTTTACTTGT TGCCGCGCAG GGGCCCTAGA TTGGGTGTGC 480

GCGCGACGAG GAAGACTTCC GAGCGGTCGC AACCTCGTGG TAGACGTCAG CCTATCCCCA 540

AGGCACGTCG GCCCGAGGGC AGGACCTGGG CTCAGCCCGG GTACCCTTGG CCCCTCTATG 600 

GCAATGAGGG TTGCGGGTGG GCGGGATGGC TCCTGTCTCC CCGTGGCTCT CGGCCTAGCT 660 

GGGGCCCCAC AGACCCCCGG CGTAGGTCGC GCAATTTGGG TAAGGTCATC GATACCCTTA 720 

CGTGCGGCTT CGCCGACCTC ATGGGGTACA TACCGCTCGT CGGCGCCCCT CTTGGAGGCG 780

CTGCCAGGGC CCTGGCGCAT GGCGTCCGGG TTCTGGAAGA CGGCGTGAAC TATGCAACAG 840

GGAACCTTCC TGGTTGCTCT TTCTCTATCT TCCTTCTGGC CCTGCTCTCT TGCCTGACTG 900 

TGCCCGCTTC AGCCTACCAA GTGCGCAATT CCTCGGGGCT TTACCATGTC ACCAATGATT 960 

GCCCTAATTC GAGTATTGTG TACGAGGCGG CCGATGCCAT CCTGCACACT CCGGGGTGTG 1020 

TCCCTTGCGT TCGCGAGGGT AACGCCTCGA GGTGTTGGGT GGCGGTGACC CCCACGGTGG 1080

CCACCAGGGA CGGCAAACTC CCCACAACGC AGCTTCGACG TCATATCGAT CTGCTTGTCG 1140 

GGAGCGCCAC CCTCTGCTCA GCCCTCTACG TGGGGGACCT GTGCGGGTCT GTTTTTCTTG 1200

TTGGTCAACT GTTTACCTTC TCTCCCAGGC GCCACTGGAC GACGCAAAGC TGCAATTGTT 1260

CTATCTATCC CGGCCATATA ACGGGTCATC GCATGGCATG GGATATGATG ATGAACTGGT 1320

CCCCTACGGC AGCGTTGGTG GTAGCTCAGC TGCTCCGGAT CCCACAAGCC ATCATGGACA 1380

TGATCGCTGG TGCTCACTGG GGAGTCCTGG CGGGCATAGC GTATTTCTCC ATGGTGGGGA 1440 

ACTGGGCGAA GGTCCTGGTA GTGCTGCTGC TATTTGCCGG CGTCGACGCG GAAACCCACG 1500 

TCACCGGGGG AAGTGCCGGC CACACCACGG CTGGGCTTGT TGGTCTCCTT ACACCAGGCG 1560 

CCAAGCAGAA CATCCAACTG ATCAACACCA ACGGCAGTTG GCACATCAAT AGCACGGCCT 1620
TGAACTGCAA CGATAGCCTT ACCACCGGCT GGTTAGCAGG GCTCTTCTAT CGCCACAAAT
TCAACTCTTC AGGCTGTCCT GAGAGGTTGG CCAGCTGCCG ACGCCTTACC GATTTTGCCC 
AGGGCTGGGG TCCCATCAGT TATGCCAACG GAAGCGGCCT TGACGAACGC CCCTACTGTT 
GGCACTACCC TCCAAGACCT TGTGGCATTG TGCCCGCAAA GAGCGTGTGT GGCCCGGTAT 
ATTGCTTCAC TCCCAGCCCC GTGGTGGTGG GAACGACCGA CAGGTCGGGC GCGCCTACCT 
ACAGCTGGGG TGCAAATGAT ACGGATGTCT TCGTCCTTAA CAACACCAGG CCACCGCTGG 
GCAATTGGTT CGGTTGTACC TGGATGAACT CAACTGGATT CACCAAAGTG TGCGGAGCGC 
CCCCTTGTGT CATCGGAGGG GTGGGCAACA ACACCTTGCT CTGCCCCACT GATTGCTTCC 
GCAAACATCC GGAAGCCACA TACTCTCGGT GCGGCTCCGG TCCCTGGATT ACACCCAGGT 
GCATGGTCGA CTACCCGTAT AGGCTTTGGC ACTATCCTTG TACTATCAAT TACACCATAT 
TCAAAGTCAG GATGTACGTG GGAGGGGTCG AGCACAGGCT GGAAGCGGCC TGCAACTGGA 
CGCGGGGCGA ACGCTGTGAT CTGGAAGACA GGGACAGGTC CGAGCTCAGC CCATTGCTGC 
TGTCCACCAC ACAGTGGCAG GTCCTTCCGT GTTCTTTCAC GACCCTGCCA GCCTTGTCCA 
CCGGCCTCAT CCACCTCCAC CAGAACATTG TGGACGTGCA GTACTTGTAC GGGGTGGGGT 
CAAGCATCGC GTCCTGGGCC ATTAAGTGGG AGTACGTCGT TCTCCTGTTC CTTCTGCTTG 
CAGACGCGCG CGTCTGCTCC TGCTTGTGGA TGATGTTACT CATATCCCAA GCGGAGGCGG 
CTTTGGAGAA CCTCGTAATA CTCAATGCAG CATCCCTGGC CGGGACGCAC GGTCTTGTGT 
CCTTCCTCGT GTTCTTCTGC TTTGCGTGGT ATCTGAAGGG TAGGTGGGTG CCCGGAGCGG 
TCTACGCCTT CTACGGGATG TGGCCTCTCC TCCTGCTCCT GCTGGCGTTG CCTCAGCGGG 
CATACGCACT GGACACGGAG GTGGCCGCGT CGTGTGGCGG CGTTGTTCTT GTCGGGTTAA 
TGGCGCTGAC TCTGTCACCA TATTACAAGC GCTATATCAG CTGGTGCATG TGGTGGCTTC 
AGTATTTTCT GACCAGAGTA GAAGCGCAAC TGCACGTGTG GGTTCCCCCC CTCAACGTCC 
GGGGGGGGCG CGATGCCGTC ATCTTACTCA TGTGTGTTGT ACACCCGACT CTGGTATTTG 
ACATCACCAA ACTACTCCTG GCCATCTTCG GACCCCTTTG GATTCTTCAA GCCAGTTTGC 
TTAAAGTCCC CTACTTCGTG CGCGTTCAAG GCCTTCTCCG GATCTGCGCG CTAGCGCGGA 
AGATAGCCGG AGGTCATTAC GTGCAAATGG CCATCATCAA GTTGGGGGCG CTTACTGGCA 
CCTATGTGTA TAACCATCTC ACCCCTCTTC GAGACTGGGC GCACAACGGC CTGCGAGATC 
TGGCCGTGGC TGTGGAACCA GTCGTCTTCT CCCGAATGGA GACCAAGCTC ATCACGTGGG 
GGGCAGATAC CGCCGCGTGC GGTGACATCA TCAACGGCTT GCCCGTCTCT GCCCGTAGGG 
GCCAGGAGAT ACTGCTTGGA CCAGCCGACG GAATGGTCTC CAAGGGGTGG AGGTTGCTGG 
CGCCCATCAC GGCGTACGCC CAGCAGACGA GAGGCCTCCT AGGGTGTATA ATCACCAGCC 
TGACTGGCCG GGACAAAAAC CAAGTGGAGG GTGAGGTCCA GATCGTGTCA ACTGCTACCC
AAACCTTCCT GGCAACGTGC ATCAATGGGG TATGCTGGAC TGTCTACCAC GGGGCCGGAA
CGAGGACCAT CGCATCACCC AAGGGTCCTG TCATCCAGAT GTATACCAAT GTGGACCAAG 
ACCTTGTGGG CTGGCCCGCT CCTCAAGGTT CCCGCTCATT GACACCCTGC ACCTGCGGCT 
CCTCGGACCT TTACCTGGTT ACGAGGCACG CCGACGTCAT TCCCGTGCGC CGGCGAGGTG 
ATAGCAGGGG TAGCCTGCTT TCGCCCCGGC CCATTTCCTA CCTAAAAGGC TCCTCGGGGG 
GTCCGCTGTT GTGCCCCGCG GGACACGCCG TGGGCCTATT CAGGGCCGCG GTGTGCACCC 
GTGGAGTGAC CAAGGCGGTG GACTTTATCC CTGTGGAGAA CCTAGAGACA ACCATGAGAT 
CCCCGGTGTT CACGGACAAC TCCTCTCCAC CAGCAGTGCC CCAGAGCTTC CAGGTGGCCC 
ACCTGCATGC TCCCACCGGC AGTGGTAAGA GCACCAAGGT CCCGGCTGCG TACGCAGCCC 
AGGGCTACAA GGTGTTGGTG CTCAACCCCT CTGTTGCTGC AACGCTGGGC TTTGGTGCTT 
ACATGTCCAA GGCCCATGGG GTCGATCCTA ATATCAGGAC CGGGGTGAGA ACAATTACCA 
CTGGCAGCCC CATCACGTAC TCCACCTACG GCAAGTTCCT TGCCGACGGC GGGTGCTCAG 
GAGGCGCTTA TGACATAATA ATTTGTGACG AGTGCCACTC CACGGATGCC ACATCCATCT 
TGGGCATCGG CACTGTCCTT GACCAAGCAG AGACTGCGGG GGCGAGATTG GTTGTGCTCG 
CCACTGCTAC CCCTCCGGGC TCCGTCACTG TGTCCCATCC TAACATCGAG GAGGTTGCTC 
TGTCCACCAC CGGAGAGATC CCTTTCTACG GCAAGGCTAT CCCCCTCGAG GTGATCAAGG 
GGGGAAGACA TCTCATCTTC TGTCACTCAA AGAAGAAGTG CGACGAGCTC GCCGCGAAGC 
TGGTCGCATT GGGCATCAAT GCCGTGGCCT ACTACCGCGG ACTTGACGTG TCTGTCATCC 
CGACCAACGG CGATGTTGTC GTCGTGTCGA CCGATGCTCT CATGACTGGC TTTACCGGCG 
ACTTCGACTC TGTGATAGAC TGCAACACGT GTGTCACTCA GACAGTCGAT TTCAGCCTTG 
ACCCTACCTT TACCATTGAG ACAACCACGC TCCCCCAGGA TGCTGTCTCC AGGACTCAGC 
GCCGGGGCAG GACTGGCAGG GGGAAGCCAG GCATCTACAG ATTTGTGGCA CCGGGGGAGC 
GCCCCTCCGG CATGTTCGAC TCGTCCGTCC TCTGTGAGTG CTATGACGCG GGCTGTGCTT 
GGTATGAGCT CATGCCCGCC GAGACTACAG TTAGGCTACG AGCGTACATG AACACCCCGG 
GGCTTCCCGT GTGCCAGGAC CATCTTGAAT TTTGGGAGGG CGTCTTTACG GGCCTCACCC 
ATATAGATGC CCACTTTCTA TCCCAGACAA AGCAGAGTGG GGAGAACTTT CCTTACCTGG 
TAGCGTACCA AGCCACCGTG TGCGCTAGGG CTCAAGCCCC TCCCCCATCG TGGGACCAGA 
TGTGGAAGTG TTTGATCCGC CTTAAACCCA CCCTCCATGG GCCAACACCC CTGCTATACA 
GACTGGGCGC TGTTCAGAAT GAAGTCACCC TGACGCACCC AATCACCAAA TACATCATGA 
CATGCATGTC GGCCGACCTG GAGGTCGTCA CGAGCACCTG GGTGCTCGTT GGCGGCGTCC 
TGGCTGCTCT GGCCGCGTAT TGCCTGTCAA CAGGCTGCGT GGTCATAGTG GGCAGGATTG 
TCTTGTCCGG GAAGCCGGCA ATTATACCTG ACAGGGAGGT TCTCTACCAG GAGTTCGATG
AGATGGAAGA GTGCTCTCAG CACTTACCGT ACATCGAGCA AGGGATGATG CTCGCTGAGC 5520

AGTTCAAGCA GAAGGCCCTC GGCCTCCTGC AGACCGCGTC CCGCCATGCA GAGGTTATCA 5580

CCCCTGCTGT CCAGACCAAC TGGCAGAAAC TCGAGGTCTT CTGGGCGAAG CACATGTGGA 5640

ATTTCATCAG TGGGATACAA TATTTGGCGG GCCTGTCAAC GCTGCCTGGT AACCCCGCCA 5700

TTGCTTCATT GATGGCTTTT ACAGCTGCCG TCACCAGCCC ACTAACCACT GGCCAAACCC 5760

TCCTCTTCAA CATATTGGGG GGGTGGGTGG CTGCCCAGCT CGCCGCCCCC GGTGCCGCTA 5820

CCGCCTTTGT GGGCGCTGGC TTAGCTGGCG CCGCCATCGG CAGCGTTGGA CTGGGGAAGG 5880

TCCTCGTGGA CATTCTTGCA GGGTATGGCG CGGGCGTGGC GGGAGCTCTT GTAGCATTCA 5940

AGATCATGAG CGGTGAGGTC CCCTCCACGG AGGACCTGGT CAATCTGCTG CCCGCCATCC 6000 

TCTCGCCTGG AGCCCTTGTA GTCGGTGTGG TCTGCGCAGC AATACTGCGC CGGCACGTTG 6060 

GCCCGGGCGA GGGGGCAGTG CAATGGATGA ACCGGCTAAT AGCCTTCGCC TCCCGGGGGA 6120 

ACCATGTTTC CCCCACGCAC TACGTGCCGG AGAGCGATGC AGCCGCCCGC GTCACTGCCA 6180

TACTCAGCAG CCTCACTGTA ACCCAGCTCC TGAGGCGACT ACATCAGTGG ATAAGCTCGG 6240

AGTGTACCAC TCCATGCTCC GGCTCCTGGC TAAGGGACAT CTGGGACTGG ATATGCGAGG 6300 

TGCTGAGCGA CTTTAAGACC TGGCTGAAAG CCAAGCTCAT GCCACAACTG CCTGGGATTC 6360 

CCTTTGTGTC CTGCCAGCGC GGGTATAGGG GGGTCTGGCG AGGAGACGGC ATTATGCACA 6420

CTCGCTGCCA CTGTGGAGCT GAGATCACTG GACATGTCAA AAACGGGACG ATGAGGATCG 6480

TCGGTCCTAG GACCTGCAGG AACATGTGGA GTGGGACGTT CCCCATTAAC GCCTACACCA 6540 

CGGGCCCCTG TACTCCCCTT CCTGCGCCGA ACTATAAGTT CGCGCTGTGG AGGGTGTCTG 6600 

CAGAGGAATA CGTGGAGATA AGGCGGGTGG GGGACTTCCA CTACGTATCG GGTATGACTA 6660 

CTGACAATCT TAAATGCCCG TGCCAGATCC CATCGCCCGA ATTTTTCACA GAATTGGACG 6720

GGGTGCGCCT ACATAGGTTT GCGCCCCCTT GCAAGCCCTT GCTGCGGGAG GAGGTATCAT 6780

TCAGAGTAGG ACTCCACGAG TACCCGGTGG GGTCGCAATT ACCTTGCGAG CCCGAACCGG 6840

ACGTAGCCGT GTTGACGTCC ATGCTCACTG ATCCCTCCCA TATAACAGCA GAGGCGGCCG 6900 

GGAGAAGGTT GGCGAGAGGG TCACCCCCTT CTATGGCCAG CTCCTCGGCC AGCCAGCTGT 6960 

CCGCTCCATC TCTCAAGGCA ACTTGCACCG CCAACCATGA CTCCCCTGAC GCCGAGCTCA 7020 

TAGAGGCTAA CCTCCTGTGG AGGCAGGAGA TGGGCGGCAA CATCACCAGG GTTGAGTCAG 7080 

AGAACAAAGT GGTGATTCTG GACTCCTTCG ATCCGCTTGT GGCAGAGGAG GATGAGCGGG 7140 

AGGTCTCCGT ACCCGCAGAA ATTCTGCGGA AGTCTCGGAG ATTCGCCCGG GCCCTGCCCG 7200 

TTTGGGCGCG GCCGGACTAC AACCCCCCGC TAGTAGAGAC GTGGAAAAAG CCTGACTACG 7260

AACCACCTGT GGTCCATGGC TGCCCGCTAC CACCTCCACG GTCCCCTCCT GTGCCTCCGC 7320 

CTCGGAAAAA GCGTACGGTG GTCCTCACCG AATCAACCCT ACCTACTGCC TTGGCCGAGC 7380
TTGCCACCAA AAGTTTTGGC AGCTCCTCAA CTTCCGGCAT TACGGGCGAC AATATGACAA 7440

CATCCTCTGA GCCCGCCCCT TCTGGCTGCC CCCCCGACTC CGACGTTGAG TCCTATTCTT 7500 

CCATGCCCCC CCTGGAGGGG GAGCCTGGGG ATCCGGATTT CAGCGACGGG TCATGGTCGA 7560 

CGGTCAGTAG TGGGGCCGAC ACGGAAGATG TCGTGTGCTG CTCAATGTCT TATACCTGGA 7620

CAGGCGCACT CGTCACCCCG TGCGCTGCGG AAGAACAAAA ACTGCCCATC AACGCACTGA 7680

GCAACTCGTT GCTACGCCAT CACAATCTGG TATATTCCAC CACTTCACGC AGTGCTTGCC 7740 

AAAGGCAGAA GAAAGTCACA TTTGACAGAC TGCAAGTTCT GGACAGCCAT TACCAGGACG 7800

TGCTCAAGGA GGTCAAAGCA GCGGCGTCAA AAGTGAAGGC TAACTTGCTA TCCGTAGAGG 7860

AAGCTTGCAG CCTGACGCCC CCACATTCAG CCAAATCCAA GTTTGGCTAT GGGGCAAAAG 7920

ACGTCCGTTG CCATGCCAGA AAGGCCGTAG CCCACATCAA CTCCGTGTGG AAAGACCTTC 7980

TGGAAGACAG TGTAACACCA ATAGACACTA TCATCATGGC CAAGAACGAG GTCTTCTGCG 8040

TTCAGCCTGA GAAGGGGGGT CGTAAGCCAG CTCGTCTCAT CGTGTTCCCC GACCTGGGCG 8100 

TGCGCGTGTG CGAGAAGATG GCCCTGTACG ACGTGGTTAG CAAACTCCCC CTGGCCGTGA 8160

TGGGAAGCTC CTACGGATTC CAATACTCAC CAGGACAGCG GGTTGAATTC CTCGTGCAAG 8220

CGTGGAAGTC CAAGAAGACC CCGATGGGGT TCCCGTATGA TACCCGCTGT TTTGACTCCA 8280

CAGTCACTGA GAGCGACATC CGTACGGAGG AGGCAATTTA CCAATGTTGT GACCTGGACC 8340

CCCAAGCCCG CGTGGCCATC AAGTCCCTCA CTGAGAGGCT TTATGTTGGG GGCCCTCTTA 8400

CCAATTCAAG GGGGGAAAAC TGCGGCTATC GCAGGTGCCG CGCGAGCGGC GTACTGACAA 8460 

CTAGCTGTGG TAACACCCTC ACTTGCTACA TCAAGGCCCG GGCAGCCCGT CGAGCCGCAG 8520 

GGCTCCAGGA CTGCACCATG CTCGTGTGTG GCGACGACTT AGTCGTTATC TGTGAAAGTG 8580

CGGGGGTCCA GGAGGACGCG GCGAGCCTGA GAGCCTTTAC GGAGGCTATG ACCAGGTACT 8640

CCGCCCCCCC CGGGGACCCC CCACAACCAG AATACGACTT GGAGCTTATA ACATCATGCT 8700

CCTCCAACGT GTCAGTCGCC CACGACGGCG CTGGAAAAAG GGTCTACTAC CTTACCCGTG 8760

ACCCTACAAC CCCCCTCGCG AGAGCCGCGT GGGAGACAGC AAGACACACT CCAGTCAATT 8820

CCTGGCTAGG CAACATAATC ATGTTTGCCC CCACACTGTG GGCGAGGATG ATACTGATGA 8880 

CCCATTTCTT TAGCGTCCTC ATAGCCAGGG ATCAGCTTGA ACAGGCTCTT AACTGTGAGA 8940

TCTACGCAGC CTGCTACTCC ATAGAACCAC TGGATCTACC TCCAATCATT CAAAGACTCC 9000 

ATGGCCTCAG CGCATTTTTA CTCCACAGTT ACTCTCCAGG TGAAGTCAAT AGGGTGGCCG 9060 

CATGCCTCAG AAAACTTGGG GTCCCGCCCT TGCGAGCTTG GAGACACCGG GCCCGGAGCG 9120

TCCGCGCTAG GCTTCTGTCC AGGGGAGGCA GGGCTGCCAT ATGTGGCAAG TACCTCTTCA 9180

ACTGGGCAGT AAGAACAAAG CTCAAACTCA CTCCAATAGC GGCCGCTGGC CGGCTGGACT 9240 

TGTCCGGTTG GTTCACGGCT GGCTACAGCG GGGGAGACAT TTATCACAGC GTGTCTCATG 9300
CCCGGCCCCG CTGGTTCTGG TTTTGCCTAC TCCTGCTCGC TGCAGGGGTA GGCATCTACC 9360
TCCTCCCCAA CCGGTGAACG GGGAGCTAGA CACTCCGGCC T 9401

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
GGACTAGTCT GCAGTCTAGA GCTCCATGGC GCCCATCACG GCGTACG 47
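
A minimal Python sketch of how the 3' end of this primer can be checked against its template, assuming that SEQ ID NO: 2 acts as a forward primer annealing at the start of the NS3 coding region of SEQ ID NO:1 (position 3420 in the feature table above). The template fragment is copied from nucleotides 3411 to 3440 of SEQ ID NO:1; the function name is hypothetical.

```python
PRIMER_SEQ_ID_2 = "GGACTAGTCTGCAGTCTAGAGCTCCATGGCGCCCATCACGGCGTACG"

# Nucleotides 3411-3440 of SEQ ID NO:1, spanning the start of the NS3 feature.
TEMPLATE_FRAGMENT = "AGGTTGCTGG" "CGCCCATCACGGCGTACGCC"


def longest_3prime_match(primer, template):
    """Return the longest 3'-terminal stretch of `primer` found in `template`."""
    for n in range(len(primer), 0, -1):
        suffix = primer[-n:]
        if suffix in template:
            return suffix
    return ""


match = longest_3prime_match(PRIMER_SEQ_ID_2, TEMPLATE_FRAGMENT)
start = TEMPLATE_FRAGMENT.find(match)
print(f"3' match of {len(match)} nt beginning at fragment offset {start}: {match}")
```

The 5' portion of the primer does not match this fragment, so only the 3' stretch would be expected to anneal to the template.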

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
CGGCTGCACC TCCAGCAGTG CATTTTAGAT CTTAAGTCTA GAAG 44

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 
AAGCTACTCT ACCTTCTCAC GATTTTAGAT CTTAAGTCTA GAAG 44

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 45 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GGATCTCTGT TGGTACTCTA GGATTTTAGA TCTTAAGTCT AGAAG 45

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 64 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: both 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "OLIGONUCLEOTIDE DUPLEX" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO

(v) FRAGMENT TYPE: internal

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1..4

(D) OTHER INFORMATION: /product= "SINGLE STRANDED REGION
ON CODING STRAND"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 61..64

(D) OTHER INFORMATION: /product= "SINGLE STRANDED REGION
ON COMPLEMENTARY STRAND"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

TGCACGGCGC CGACACGGAA GATGTCGTGT GCTGCTCAAT GTCTTATACC TGGACAGGCG 

TGCA 

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(iii) HYPOTHETICAL: NO
(v) FRAGMENT TYPE: internal

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

Gly Ala Asp Thr Glu Asp Val Val Cys Cys Ser Met Ser Tyr Thr Trp
1               5                   10                  15

Thr Gly Val His
            20



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 47 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 
GGACTAGTCT GCAGTCTAGA GCTCCATGGC GCCCATCACG GCGTACG 47

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 27 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
CGGGAGCCGG AGGACGTCTG GCGCAGG 27 
(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1497 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 87..893

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 426..427

(D) OTHER INFORMATION: /label= ApaLIsite



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

ACCAACCTCT TCGAGGCACA AGGCACAACA GGCTGCTCTG GGATTCTCTT CAGCCAATCT 60 

TCATTGCTCA AGTGTCTGAA GCAGCC ATG GCA GAA GTA CCT GAG CTC GCC AGT 113
                             Met Ala Glu Val Pro Glu Leu Ala Ser
                             1               5

GAA ATG ATG GCT TAT TAC AGT GGC AAT GAG GAT GAC TTG TTC TTT GAA 161
Glu Met Met Ala Tyr Tyr Ser Gly Asn Glu Asp Asp Leu Phe Phe Glu
10                  15                  20                  25

GCT GAT GGC CCT AAA CAG ATG AAG TGC TCC TTC CAG GAC CTG GAC CTC 209
Ala Asp Gly Pro Lys Gln Met Lys Cys Ser Phe Gln Asp Leu Asp Leu
                30                  35                  40

TGC CCT CTG GAT GGC GGC ATC CAG CTA CGA ATC TCC GAC CAC CAC TAC 257
Cys Pro Leu Asp Gly Gly Ile Gln Leu Arg Ile Ser Asp His His Tyr
            45                  50                  55

AGC AAG GGC TTC AGG CAG GCC GCG TCA GTT GTT GTG GCC ATG GAC AAG 305
Ser Lys Gly Phe Arg Gln Ala Ala Ser Val Val Val Ala Met Asp Lys
        60                  65                  70

CTG AGG AAG ATG CTG GTT CCC TGC CCA CAG ACC TTC CAG GAG AAT GAC 353
Leu Arg Lys Met Leu Val Pro Cys Pro Gln Thr Phe Gln Glu Asn Asp
    75                  80                  85

CTG AGC ACC TTC TTT CCC TTC ATC TTT GAA GAA GAA CCT ATC TTC TTC 401
Leu Ser Thr Phe Phe Pro Phe Ile Phe Glu Glu Glu Pro Ile Phe Phe
90                  95                  100                 105

GAC ACA TGG GAT AAC GAG GCT TAT GTG CAC GAT GCA CCT GTA CGA TCA 449
Asp Thr Trp Asp Asn Glu Ala Tyr Val His Asp Ala Pro Val Arg Ser
                110                 115                 120

CTG AAC TGC ACG CTC CGG GAC TCA CAG CAA AAA AGC TTG GTG ATG TCT 497
Leu Asn Cys Thr Leu Arg Asp Ser Gln Gln Lys Ser Leu Val Met Ser
            125                 130                 135

GGT CCA TAT GAA CTG AAA GCT CTC CAC CTC CAG GGA CAG GAT ATG GAG 545
Gly Pro Tyr Glu Leu Lys Ala Leu His Leu Gln Gly Gln Asp Met Glu
        140                 145                 150

CAA CAA GTG GTG TTC TCC ATG TCC TTT GTA CAA GGA GAA GAA AGT AAT 593
Gln Gln Val Val Phe Ser Met Ser Phe Val Gln Gly Glu Glu Ser Asn
    155                 160                 165

GAC AAA ATA CCT GTG GCC TTG GGC CTC AAG GAA AAG AAT CTG TAC CTG 641
Asp Lys Ile Pro Val Ala Leu Gly Leu Lys Glu Lys Asn Leu Tyr Leu
170                 175                 180                 185

TCC TGC GTG TTG AAA GAT GAT AAG CCC ACT CTA CAG CTG GAG AGT GTA 689
Ser Cys Val Leu Lys Asp Asp Lys Pro Thr Leu Gln Leu Glu Ser Val
                190                 195                 200

GAT CCC AAA AAT TAC CCA AAG AAG AAG ATG GAA AAG CGA TTT GTC TTC 737
Asp Pro Lys Asn Tyr Pro Lys Lys Lys Met Glu Lys Arg Phe Val Phe
            205                 210                 215

AAC AAG ATA GAA ATC AAT AAC AAG CTG GAA TTT GAG TCT GCC CAG TTC 785
Asn Lys Ile Glu Ile Asn Asn Lys Leu Glu Phe Glu Ser Ala Gln Phe
        220                 225                 230

CCC AAC TGG TAC ATC AGC ACC TCT CAA GCA GAA AAC ATG CCC GTC TTC 833
Pro Asn Trp Tyr Ile Ser Thr Ser Gln Ala Glu Asn Met Pro Val Phe
    235                 240                 245

CTG GGA GGG ACC AAA GGC GGC CAG GAT ATA ACT GAC TTC ACC ATG CAA 881
Leu Gly Gly Thr Lys Gly Gly Gln Asp Ile Thr Asp Phe Thr Met Gln
250                 255                 260                 265

TTT GTG TCT TCC TAAAGAGAGC TGTACCCAGA GAGTCCTGTG CTGAATGTGG 933
Phe Val Ser Ser

ACTCAATCCC TAGGGCTGGC AGAAAGGGAA CAGAAAGGTT TTTGAGTACG GCTATAGCCT 993 

GGACTTTCCT GTTGTCTACA CCAATGCCCA ACTGCCTGCC TTAGGGTAGT GCTAAGAGGA 1053 

TCTCCTGTCC ATCAGCCAGG ACAGTCAGCT CTCTCCTTTC AGGGCCAATC CCCAGCCCTT 1113 

TTGTTGAGCC AGGCCTCTCT CACCTCTCCT ACTCACTTAA AGCCCGCCTG ACAGAAACCA 1173 

CGGCCACATT TGGTTCTAAG AAACCCTCTG TCATTCGCTC CCACATTCTG ATGAGCAACC 1233 

GCTTCCCTAT TTATTTATTT ATTTGTTTGT TTGTTTTATT CATTGGTCTA ATTTATTCAA 1293 

AGGGGGCAAG AAGTAGCAGT GTCTGTAAAA GAGCCTAGTT TTTAATAGCT ATGGAATCAA 1353 

TTCAATTTGG ACTGGTGTGC TCTCTTTAAA TCAAGTCCTT TAATTAAGAC TGAAAATATA 1413 

TAAGCTCAGA TTATTTAAAT GGGAATATTT ATAAATGAGC AAATATCATA CTGTTCAATG 1473

GTTCTGAAAT AAACTTCTCT GAAG 1497
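
A minimal Python sketch of how the nucleotide rows of this CDS can be cross-checked against the amino acid rows printed beneath them, here for the first nine codons (positions 87 to 113) using the standard genetic code. The compact codon-table construction and the function name are illustrative choices and not part of the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First nine codons of the CDS in SEQ ID NO: 10 (positions 87-113).
first_codons = "ATG GCA GAA GTA CCT GAG CTC GCC AGT"
print(translate(first_codons))  # one-letter codes for Met Ala Glu Val Pro Glu Leu Ala Ser
```

A check of this kind also flags transcription errors: a codon read as TAG would translate to a stop (`*`), whereas TAC gives Tyr (`Y`).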

(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 269 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Ala Glu Val Pro Glu Leu Ala Ser Glu Met Met Ala Tyr Tyr Ser
1               5                   10                  15

Gly Asn Glu Asp Asp Leu Phe Phe Glu Ala Asp Gly Pro Lys Gln Met
            20                  25                  30

Lys Cys Ser Phe Gln Asp Leu Asp Leu Cys Pro Leu Asp Gly Gly Ile
        35                  40                  45

Gln Leu Arg Ile Ser Asp His His Tyr Ser Lys Gly Phe Arg Gln Ala
    50                  55                  60

Ala Ser Val Val Val Ala Met Asp Lys Leu Arg Lys Met Leu Val Pro
65                  70                  75                  80

Cys Pro Gln Thr Phe Gln Glu Asn Asp Leu Ser Thr Phe Phe Pro Phe
                85                  90                  95

Ile Phe Glu Glu Glu Pro Ile Phe Phe Asp Thr Trp Asp Asn Glu Ala
            100                 105                 110

Tyr Val His Asp Ala Pro Val Arg Ser Leu Asn Cys Thr Leu Arg Asp
        115                 120                 125

Ser Gln Gln Lys Ser Leu Val Met Ser Gly Pro Tyr Glu Leu Lys Ala
    130                 135                 140

Leu His Leu Gln Gly Gln Asp Met Glu Gln Gln Val Val Phe Ser Met
145                 150                 155                 160

Ser Phe Val Gln Gly Glu Glu Ser Asn Asp Lys Ile Pro Val Ala Leu
                165                 170                 175

Gly Leu Lys Glu Lys Asn Leu Tyr Leu Ser Cys Val Leu Lys Asp Asp
            180                 185                 190

Lys Pro Thr Leu Gln Leu Glu Ser Val Asp Pro Lys Asn Tyr Pro Lys
        195                 200                 205

Lys Lys Met Glu Lys Arg Phe Val Phe Asn Lys Ile Glu Ile Asn Asn
    210                 215                 220

Lys Leu Glu Phe Glu Ser Ala Gln Phe Pro Asn Trp Tyr Ile Ser Thr
225                 230                 235                 240

Ser Gln Ala Glu Asn Met Pro Val Phe Leu Gly Gly Thr Lys Gly Gly
                245                 250                 255

Gln Asp Ile Thr Asp Phe Thr Met Gln Phe Val Ser Ser
            260                 265

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 36 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:
CTCGGCCTCC TGCAGGCACC TGTACGATCA CTGAAC 36

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 32 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "oligonucleotide primer" 

(iii) HYPOTHETICAL: NO 

(iv) ANTI-SENSE: NO 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
GTTAAACACA GAAGGATTTT AGATCTTAAG GG 32

I claim:



1. A method for assaying exogenous protease
activity in a host cell comprising the steps of: 

(a) incubating a host cell transformed with a 
first nucleotide sequence encoding an exogenous protease and a 
second nucleotide sequence encoding an artificial polypeptide 
substrate; 

wherein said substrate comprises: 

(i) a cleavage site for said exogenous
protease; and

(ii) a polypeptide that is secreted out of 
said cell following cleavage by said exogenous protease; 

under conditions which cause said exogenous protease and said 
artificial substrate to be expressed; 

(b) separating said host cell from its growth 
media under non-lytic conditions; and 

(c) assaying said growth media for the 
presence of said secreted polypeptide. 

2. A method for assaying endogenous protease 
activity in a host cell comprising the steps of: 

(a) incubating a host cell transformed with a 
nucleotide sequence encoding an artificial polypeptide 
substrate; 

wherein said substrate comprises: 

(i) a cleavage site for said endogenous
protease; and

(ii) a polypeptide that is secreted out of 
said cell following cleavage by said endogenous protease; 

under conditions which cause said artificial substrate to be 
expressed; 

(b) separating said host cell from its growth 
media under non-lytic conditions; and 

(c) assaying said growth media for the 
presence of said secreted polypeptide.

3. A method for identifying a compound as an
inhibitor of a protease comprising the steps of: 

(a) assaying the activity of a protease in the 
absence of said compound by a method according to claim 1 or 
2; 

(b) assaying the activity of a protease in the 
presence of said compound by a method according to claim 1 or
2, wherein said compound is added to the host cells during
said incubation of said host cells; and

(c) comparing the results of step (a) with the
results of step (b).

4. The method according to claim 1 or claim 3, 
insofar as it depends from claim 1, wherein said first 
nucleotide sequence and said second nucleotide sequence encode 
a single polypeptide.

5. The method according to claim 4, wherein said
first and second nucleotide sequences encode NS3-4A-Δ4B-IL-1β.

6. The method according to any one of claims 1 to
3, wherein said first nucleotide sequence encodes a viral
protease or an enzymatically active fragment thereof.

7. The method according to claim 6, wherein said
first nucleotide sequence encodes hepatitis C virus NS3
protease, an NS3-4A fusion protein or amino acids 1-180 of NS3
protease.

8. The method according to any one of claims 1 to 
3, wherein said secreted polypeptide is selected from 
polypeptides comprising mature IL-1β, mature IL-1α, basic
fibroblast growth factor and endothelial-monocyte activating 
polypeptide II. 

9. The method according to claim 8, wherein said
secreted polypeptide comprises mature IL-1β.

10. The method according to claim 9, wherein said
artificial polypeptide substrate is selected from pre-IL-1β*
or pre-IL-1β (CSM).

11. A host cell transformed with a nucleotide 
sequence encoding an artificial polypeptide substrate, wherein 
said substrate comprises: 

(a) a cleavage site for said exogenous 
protease; and 

(b) a polypeptide that is secreted out of said 
cell following cleavage by said exogenous protease; 

said host cell being capable of expressing said protease and 
said substrate. 

12. A host cell transformed with a first nucleotide 
sequence encoding an exogenous protease and a second 
nucleotide sequence encoding an artificial polypeptide 
substrate, wherein said substrate comprises: 

(a) a cleavage site for said exogenous 
protease; and 

(b) a polypeptide that is secreted out of said 
cell following cleavage by said exogenous protease; 

said host cell being capable of expressing said protease and 
said substrate. 

13. The host cell according to claim 11 or 12, 
wherein said secreted polypeptide is selected from 
polypeptides comprising mature IL-1β, mature IL-1α, basic 
fibroblast growth factor and endothelial-monocyte activating 
polypeptide II. 

14. The host cell according to claim 13, wherein 
said secreted polypeptide comprises mature IL-1β. 




15. The host cell according to claim 14, wherein 
said artificial polypeptide substrate is selected from 
pre-IL-1β* or pre-IL-1β (CSM). 

16. The host cell according to claim 12, wherein 
said first nucleotide sequence and said second nucleotide 
sequence encode a single polypeptide. 

17. The host cell according to claim 16, wherein 
said first and second nucleotide sequences encode 
NS3-4A-Δ4B-IL-1β. 

18. The host cell according to claim 12, wherein 
said first nucleotide sequence encodes a viral protease or an 
enzymatically active fragment thereof. 

19. The host cell according to claim 18, wherein 
said first nucleotide sequence encodes hepatitis C virus NS3 
protease, an NS3-4A fusion protein or amino acids 1-180 of NS3 
protease. 

20. The host cell according to claim 11 or 12, 
selected from E. coli, Bacillus, other bacteria, yeast and 
other fungi, plant cells, insect cells and mammalian cells. 

21. The host cell according to claim 20, wherein 
said host cell is a mammalian cell. 

22. The host cell according to claim 21, wherein 
said host cell is a COS cell. 



23. A recombinant DNA molecule comprising a DNA 
sequence encoding an artificial substrate selected from 
pre-IL-1β* and pre-IL-1β (CSM). 